The World of Fast Moving Objects
The notion of a Fast Moving Object (FMO), i.e. an object that moves over a
distance exceeding its size within the exposure time, is introduced. FMOs may,
and typically do, rotate with high angular speed. FMOs are very common in
sports videos, but are not rare elsewhere. In a single frame, such objects are
often barely visible and appear as semi-transparent streaks.
A method for the detection and tracking of FMOs is proposed. The method
consists of three distinct algorithms, which form an efficient localization
pipeline that operates successfully in a broad range of conditions. We show
that it is possible to recover the appearance of the object and its axis of
rotation, despite its blurred appearance. The proposed method is evaluated on a
new annotated dataset. The results show that existing trackers are inadequate
for the problem of FMO localization and a new approach is required. Two
applications of localization, temporal super-resolution and highlighting, are
presented.
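The FMO definition above can be written as a one-line test. The following is a minimal sketch, assuming a constant speed during the exposure and measuring everything in pixels; the function names and the example numbers are hypothetical and do not come from the paper.

```python
def displacement_during_exposure(speed_px_per_s: float, exposure_s: float) -> float:
    """Distance (in pixels) travelled while the shutter is open, assuming constant speed."""
    return speed_px_per_s * exposure_s


def is_fast_moving(displacement_px: float, object_size_px: float) -> bool:
    """FMO criterion from the abstract: the object moves farther than its own size
    within a single exposure."""
    return displacement_px > object_size_px


# Hypothetical example: a 20 px ball moving at 3000 px/s, exposure of 1/50 s.
ball_displacement = displacement_during_exposure(3000.0, 1.0 / 50.0)
ball_is_fmo = is_fast_moving(ball_displacement, object_size_px=20.0)
```

In this example the ball covers roughly three times its own size during one exposure, which is why it would show up as a streak rather than a sharp object.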
Sub-frame Appearance and 6D Pose Estimation of Fast Moving Objects
We propose a novel method that tracks fast moving objects, mainly non-uniform
spherical, in full 6 degrees of freedom, estimating simultaneously their 3D
motion trajectory, 3D pose and object appearance changes with a time step that
is a fraction of the video frame exposure time. The sub-frame object
localization and appearance estimation allows realistic temporal
super-resolution and precise shape estimation. The method, called TbD-3D
(Tracking by Deblatting in 3D) relies on a novel reconstruction algorithm which
solves a piece-wise deblurring and matting problem. The 3D rotation is
estimated by minimizing the reprojection error. As a second contribution, we
present a new challenging dataset with fast moving objects that change their
appearance and distance to the camera. High speed camera recordings with zero
lag between frame exposures were used to generate videos with different frame
rates annotated with ground-truth trajectory and pose.
NeRD: Neural field-based Demosaicking
We introduce NeRD, a new demosaicking method for generating full-color images
from Bayer patterns. Our approach leverages advancements in neural fields to
perform demosaicking by representing an image as a coordinate-based neural
network with sine activation functions. The inputs to the network are spatial
coordinates and a low-resolution Bayer pattern, while the outputs are the
corresponding RGB values. An encoder network, which is a blend of ResNet and
U-net, enhances the implicit neural representation of the image to improve its
quality and ensure spatial consistency through prior learning. Our experimental
results demonstrate that NeRD outperforms traditional and state-of-the-art
CNN-based methods and significantly closes the gap to transformer-based
methods. Comment: 5 pages, 4 figures, 1 table
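A coordinate-based network with sine activations can be sketched in a few lines of NumPy. The forward pass below is only an illustration under stated assumptions: the layer sizes, the frequency scale `omega`, and the uniform initialization are hypothetical choices, the weights are random and untrained, and the encoder and the Bayer-pattern conditioning described in the abstract are omitted.

```python
import numpy as np


def init_layer(rng, n_in, n_out):
    """Uniformly initialized dense layer (an assumed, simple initialization)."""
    bound = np.sqrt(6.0 / n_in)
    weights = rng.uniform(-bound, bound, size=(n_in, n_out))
    bias = np.zeros(n_out)
    return weights, bias


def sine_mlp(coords, layers, omega=30.0):
    """Map (x, y) coordinates to RGB: sine activations on hidden layers, linear output."""
    h = coords
    for weights, bias in layers[:-1]:
        h = np.sin(omega * (h @ weights + bias))
    weights, bias = layers[-1]
    return h @ weights + bias


rng = np.random.default_rng(0)
layers = [init_layer(rng, 2, 32), init_layer(rng, 32, 32), init_layer(rng, 32, 3)]

# Query the network on a 4x4 grid of coordinates in [-1, 1] x [-1, 1].
ys, xs = np.meshgrid(np.linspace(-1, 1, 4), np.linspace(-1, 1, 4), indexing="ij")
coords = np.stack([xs.ravel(), ys.ravel()], axis=1)
rgb = sine_mlp(coords, layers)  # one RGB triple per queried coordinate
```

Because the image is represented as a function of continuous coordinates, the same network can be queried at any pixel location, which is what makes the representation usable for filling in the color samples a Bayer pattern lacks.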
H-NeXt: The next step towards roto-translation invariant networks
The widespread popularity of equivariant networks underscores the
significance of parameter efficient models and effective use of training data.
At a time when robustness to unseen deformations is becoming increasingly
important, we present H-NeXt, which bridges the gap between equivariance and
invariance. H-NeXt is a parameter-efficient roto-translation invariant network
that is trained without a single augmented image in the training set. Our
network comprises three components: an equivariant backbone for learning
roto-translation independent features, an invariant pooling layer for
discarding roto-translation information, and a classification layer. H-NeXt
outperforms the state of the art in classification on unaugmented training sets
and augmented test sets of MNIST and CIFAR-10. Comment: Appears in British Machine Vision Conference 2023 (BMVC 2023)
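The step from equivariant features to an invariant descriptor can be illustrated with a toy example. The sketch below is not a reproduction of H-NeXt's layers: it assumes only the four 90-degree rotations on a periodic grid, uses a single hypothetical 3x3 filter, and the helper names are made up for illustration. Correlating with all four rotated copies of the filter gives features that are merely permuted and shifted when the input rotates, so pooling over orientations and positions removes the rotation.

```python
import numpy as np


def circ_correlate(img, kernel):
    """Circular (wrap-around) cross-correlation of an image with a small kernel."""
    out = np.zeros_like(img)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            # np.roll with a negative shift reads the pixel at offset (dy, dx).
            out += kernel[dy, dx] * np.roll(img, shift=(-dy, -dx), axis=(0, 1))
    return out


def invariant_descriptor(img, kernel):
    """Equivariant features (one map per filter orientation), then invariant pooling."""
    stack = np.stack([circ_correlate(img, np.rot90(kernel, k)) for k in range(4)])
    # Pool over the orientation axis first, then over all spatial positions.
    return stack.max(axis=0).max()


rng = np.random.default_rng(0)
img = rng.integers(0, 10, size=(8, 8))
kernel = np.array([[1, 2, 0], [0, -1, 3], [-2, 0, 1]])

# The descriptor is computed for the image rotated by 0, 90, 180 and 270 degrees.
descriptors = [invariant_descriptor(np.rot90(img, k), kernel) for k in range(4)]
```

All four descriptors coincide although no rotated image was ever used to fit anything, which mirrors the abstract's point that invariance can come from the architecture rather than from augmentation.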
Blur Invariants for Image Recognition
Blur is an image degradation that is difficult to remove. Invariants with
respect to blur offer an alternative way of describing and recognizing
blurred images without any deblurring. In this paper, we present an original
unified theory of blur invariants. Unlike all previous attempts, the new theory
does not require any prior knowledge of the blur type. The invariants are
constructed in the Fourier domain by means of orthogonal projection operators
and moment expansion is used for efficient and stable computation. It is shown
that all blur invariants published earlier are just particular cases of this
approach. Experimental comparison to concurrent approaches shows the advantages
of the proposed theory. Comment: 15 pages
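A classical special case gives a feel for what a Fourier-domain blur invariant looks like. The sketch below assumes a centrosymmetric blur applied by circular convolution, in which case the transfer function is real and cancels in the ratio of the spectrum to its real part. This is an illustration of one earlier, blur-type-specific invariant, not the unified construction of the paper; the function name and the test kernel are hypothetical.

```python
import numpy as np


def phase_ratio_invariant(img):
    """Spectrum divided by its real part; unchanged by centrosymmetric blur
    wherever the real part and the blur's transfer function are nonzero."""
    spectrum = np.fft.fft2(img)
    return spectrum / spectrum.real


rng = np.random.default_rng(0)
sharp = rng.random((8, 8))

# Centrosymmetric kernel on a periodic grid: h[y, x] == h[-y, -x].
blur = np.zeros((8, 8))
blur[0, 0] = 0.6
blur[0, 1] = blur[0, -1] = blur[1, 0] = blur[-1, 0] = 0.1

# Circular convolution computed through the FFT.
blurred = np.fft.ifft2(np.fft.fft2(sharp) * np.fft.fft2(blur)).real

same_invariant = np.allclose(phase_ratio_invariant(sharp), phase_ratio_invariant(blurred))
```

The blurred image differs from the sharp one pixel by pixel, yet both yield the same invariant, so they can be matched without any deblurring.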
Real-Time Wheel Detection and Rim Classification in Automotive Production
This paper proposes a novel approach to real-time automatic rim detection,
classification, and inspection by combining traditional computer vision and
deep learning techniques. At the end of every automotive assembly line, a
quality control process is carried out to identify any potential defects in the
produced cars. Common yet hazardous defects are related, for example, to
incorrectly mounted rims. Routine inspections are mostly conducted by human
workers that are negatively affected by factors such as fatigue or distraction.
We have designed a new prototype to validate whether all four wheels on a
single car match in size and type. Additionally, we present three comprehensive
open-source databases, CWD1500, WHEEL22, and RB600, for wheel, rim, and bolt
detection, as well as rim classification, which are free-to-use for scientific
purposes. Comment: 5 pages, 7 figures, 3 tables
Temporal Ordering in Endocytic Clathrin-Coated Vesicle Formation via AP2 Phosphorylation.
Clathrin-mediated endocytosis (CME) is key to maintaining the transmembrane protein composition of cells' limiting membranes. During mammalian CME, a reversible phosphorylation event occurs on Thr156 of the μ2 subunit of the main endocytic clathrin adaptor, AP2. We show that this phosphorylation event starts during clathrin-coated pit (CCP) initiation and increases throughout CCP lifetime. μ2Thr156 phosphorylation favors a new, cargo-bound conformation of AP2 and simultaneously creates a binding platform for the endocytic NECAP proteins but without significantly altering AP2's cargo affinity in vitro. We describe the structural bases of both. NECAP arrival at CCPs parallels that of clathrin and increases with μ2Thr156 phosphorylation. In turn, NECAP recruits drivers of late stages of CCP formation, including SNX9, via a site distinct from where NECAP binds AP2. Disruption of the different modules of this phosphorylation-based temporal regulatory system results in CCP maturation being delayed and/or stalled, hence impairing global rates of CME.
Image fusion via multichannel blind deconvolution.
The thesis focuses on fusion of degraded images originating from one source with the aim of obtaining an undergraded image of the source. Depending on the type of the degradation, a formalized system of the most common fusion problems is built. The unknown degradation we deal with is additive noise and space-invariant blurs modeled by convolution. The fusion process is then referred to as multichannel blind deconvolution and it frequently occurs in microscopy imaging, remote sensing, astronomical imaging, etc. A novel iterative algorithm is proposed which solves an energy minimization problem by means of an alternating minimization scheme. The energy functional, which is utilized here, incorporates regularization of the original image and blurs. Anisotropic regularization of the image based on total variation and the Mumford-Shah functional is implemented. Regularization of the blurs emanates from the multichannel framework of the problem, in particular from mutual relations between channels degraded with different blurs. A better restoration performance was achieved in comparison with previously proposed multichannel blind approaches. Primarily, an enhanced noise robustness was observed. An accurate estimation of the blur size is however necessary.Available from STL Prague, CZ / NTK - National Technical LibrarySIGLECZCzech Republi